Task-specific minimum Bayes-risk decoding using learned edit distance
نویسندگان
چکیده
This paper extends the minimum Bayes-risk framework to incorporate a loss function specific to the task and the ASR system. The errors are modeled as a noisy channel and the parameters are learned from the data. The resulting loss function is used in the risk criterion for decoding. Experiments on a large vocabulary conversational speech recognition system demonstrate significant gains of about 1% absolute over MAP hypothesis and about 0.6% absolute over untrained loss function. The approach is general enough to be applicable to other sequence recognition problems such as in Optical Character Recognition (OCR) and in analysis of biological sequences.
منابع مشابه
Minimum Bayes Risk decoding and system combination based on a recursion for edit distance
In this paper we describe a method that can be used for Minimum Bayes Risk (MBR) decoding for speech recognition. Our algorithm can take as input either a single lattice, or multiple lattices for system combination. It has similar functionality to the widely used Consensus method, but has a clearer theoretical basis and appears to give better results both for MBR decoding and system combination...
متن کاملPhoneme-Lattice to Phoneme-Sequence Matching Algorithm Based on Dynamic Programming
A novel phoneme-lattice to phoneme-sequence matching algorithm based on dynamic programming is presented in this paper. Phoneme lattices have been shown to be a good choice to encode in a compact way alternative decoding hypotheses from a speech recognition system. These are typically used for the spoken term detection and keyword-spotting tasks, where a phoneme sequence query is matched to a r...
متن کاملMinimum Bayes-Risk Decoding for Statistical Machine Translation
We present Minimum Bayes-Risk (MBR) decoding for statistical machine translation. This statistical approach aims to minimize expected loss of translation errors under loss functions that measure translation performance. We describe a hierarchy of loss functions that incorporate different levels of linguistic information from word strings, word-to-word alignments from an MT system, and syntactic...
متن کاملBayes risk decoding and its application to system combination
Speech recognition is the task of converting an acoustic signal, which contains speech, to written text. The error of a speech recognition system is measured in the number of words in which the recognized and the spoken text differ. This work investigates and develops decoding and system combination approaches within the Bayes risk decoding framework with the objective of reducing the number of...
متن کاملMixture Model-based Minimum Bayes Risk Decoding using Multiple Machine Translation Systems
We present Mixture Model-based Minimum Bayes Risk (MMMBR) decoding, an approach that makes use of multiple SMT systems to improve translation accuracy. Unlike existing MBR decoding methods defined on the basis of single SMT systems, an MMMBR decoder reranks translation outputs in the combined search space of multiple systems using the MBR decision rule and a mixture distribution of component SM...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2004